Standardization and denoising algorithms for mass spectra to classify whole-organism bacterial specimens

Authors

  • Glen A. Satten
  • Somnath Datta
  • Hercules Moura
  • Adrian R. Woolfitt
  • Maria da G. Carvalho
  • George M. Carlone
  • Barun K. De
  • Antonis Pavlopoulos
  • John R. Barr
Abstract

Motivation: Application of mass spectrometry in proteomics is a breakthrough in high-throughput analyses. Early applications have focused on protein expression profiles to differentiate among various types of tissue samples (e.g. normal versus tumor). Here our goal is to use mass spectra to differentiate bacterial species using whole-organism samples. The raw spectra are similar to spectra of tissue samples, raising some of the same statistical issues (e.g. non-uniform baselines and higher noise associated with higher baseline), but are substantially noisier. As a result, new preprocessing procedures are required before these spectra can be used for statistical classification.

Results: In this study, we introduce novel preprocessing steps that can be used with any mass spectra. These comprise a standardization step and a denoising step. The noise level for each spectrum is determined using only data from that spectrum. Only spectral features that exceed a threshold defined by the noise level are subsequently used for classification. Using this approach, we trained the Random Forest program to classify 240 mass spectra into four bacterial types. The method resulted in zero prediction errors in the training samples and in two test datasets having 240 and 300 spectra, respectively.
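The abstract does not give the formulas behind the standardization and denoising steps, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a rolling-median baseline, a noise scale taken from the median absolute deviation (MAD) of the same spectrum, and a cutoff of `k` noise units. The function names, `half_window`, and `k` are all hypothetical choices made for illustration.

```python
from statistics import median


def rolling_median(values, half_window):
    """Crude baseline: median over a sliding window (an assumed choice)."""
    n = len(values)
    return [
        median(values[max(0, i - half_window): min(n, i + half_window + 1)])
        for i in range(n)
    ]


def standardize_and_denoise(intensities, half_window=10, k=5.0):
    """Return one feature per m/z bin; bins at or below the cutoff become 0.0.

    Assumptions (not from the paper): rolling-median baseline, MAD-based
    noise scale, and a fixed cutoff of k noise units.
    """
    # Standardization: subtract the baseline estimated from this spectrum.
    baseline = rolling_median(intensities, half_window)
    residual = [y - b for y, b in zip(intensities, baseline)]

    # Noise level from this spectrum alone: MAD scaled to match the
    # standard deviation of Gaussian noise.
    center = median(residual)
    mad = median([abs(r - center) for r in residual])
    noise = 1.4826 * mad
    if noise == 0.0:
        raise ValueError("noise scale is zero; spectrum is too flat to scale")

    # Denoising: keep only features that exceed k noise units.
    scaled = [r / noise for r in residual]
    return [z if z > k else 0.0 for z in scaled]


if __name__ == "__main__":
    # Synthetic spectrum: flat baseline, small repeating noise, one peak.
    pattern = [0.0, 1.0, -1.0, 2.0, -2.0]
    spectrum = [10.0 + pattern[i % 5] for i in range(60)]
    spectrum[30] += 100.0
    features = standardize_and_denoise(spectrum, half_window=5, k=5.0)
    print([i for i, f in enumerate(features) if f > 0.0])
```

Because the noise scale is computed per spectrum, no information is shared between samples before classification. The surviving features could then be passed to a classifier such as Random Forest, which is the classifier named in the abstract.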

Similar resources

A COMPARATIVE ANALYSIS OF WAVELET-BASED FEMG SIGNAL DENOISING WITH THRESHOLD FUNCTIONS AND FACIAL EXPRESSION CLASSIFICATION USING SVM AND LSSVM

This work presents a technique for the analysis of Facial Electromyogram signal activities to classify five different facial expressions for Computer-Muscle Interfacing applications. Facial Electromyogram (FEMG) is a technique for recording the asynchronous activation of neurons inside the facial muscles with non-invasive electrodes. FEMG pattern recognition is a difficult task for the researche...

Preprocessing of tandem mass spectra using machine learning methods

Protein identification has been more helpful than before in the diagnosis and treatment of many diseases, such as cancer, heart disease and HIV. Tandem mass spectrometry is a powerful tool for protein identification. In a typical experiment, proteins are broken into small amino acid oligomers called peptides. By determining the amino acid sequence of several peptides of a protein, its whole ami...

A Bayesian approach for image denoising in MRI

Magnetic Resonance Imaging (MRI) is a notable medical imaging technique that is based on Nuclear Magnetic Resonance (NMR). MRI is a safe imaging method with high contrast between soft tissues, which made it the most popular imaging technique in clinical applications. MR image's visual quality plays a vital role in medical diagnostics that can be severely corrupted by existing noise duri...

FT-Raman Spectra of Saffron (Crocus Sativus L.); A Possible Method for Standardization of Saffron

FT-Raman spectra of saffron (Crocus sativus L.) with a partial assignment are reported. Based on the Raman data, it is concluded that the main pigments in saffron are crocins and crocetin. It is proposed that the quickly attainable FT-Raman spectrum of solid saffron may be used as a means of saffron standardization.

Comparative Analysis of Image Denoising Methods Based on Wavelet Transform and Threshold Functions

There are many unavoidable noise interferences in image acquisition and transmission. To make it better for subsequent processing, the noise in the image should be removed in advance. There are many kinds of image noises, mainly including salt and pepper noise and Gaussian noise. This paper focuses on the research of the Gaussian noise removal. It introduces many wavelet threshold denoising alg...

Journal:
  • Bioinformatics

Volume 20, Issue 17

Pages: -

Publication date: 2004